Combating dynamic threats


Modern threats are evanescent: incredibly short-lived and ever-changing. Signature-based detection, even
using regular expressions, cannot identify a wide range of current threats. It is not the right toolset, because
signatures would have to be developed, monitored, and updated for specific instances of a wide variety of
threats. Even then, these signatures would often be outdated as soon as they were released because of the
dynamic nature of modern threats. Statistical models effectively close these security gaps.
The ability to evaluate security filters that represent statistical models o
improve the security effectiveness of technologies like Next-Generation 
security managers are faced with a series of challenges: increasingly so 
and a lack of visibility across their security systems. 


The following use cases address some of the security gaps that cannot
detection solutions: malicious HTML content including JavaScript, malici
including Flash and PDF. Statistical models can be used to identify all of


EXPLOIT KITS AND OTHER MALICIOUS OBFUSCATED HTML 


Live Stack testing from NSS Labs makes use of a number of exploit kits, which account for over 80% of Live Stack
testing misses. These exploit kits deliver malicious obfuscated HTML, with code objects encoded in some manner
within the HTML document to be decoded by later JavaScript or VBScript evaluation. Delivering protection against
exploit kits, which are designed to evade detection by regular expressions, provides a substantial increase in security
effectiveness against these prevalent and growing threats.1
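
As a rough illustration of what a statistical model might measure in obfuscated content, the sketch below computes a small feature vector for a block of HTML or script text. The three features (character entropy, symbol ratio, longest string literal) and the function names are assumptions chosen for illustration; the paper does not state which features the TippingPoint filters use.

```python
import math
import re
from collections import Counter

# Matches simple double-quoted or single-quoted string literals.
LITERAL = re.compile(r'"([^"]*)"' + "|" + r"'([^']*)'")


def char_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def obfuscation_features(content: str) -> dict:
    """Hypothetical feature vector for a block of HTML or script content."""
    literals = LITERAL.findall(content)
    # Encoded payloads are often carried in one very long string literal.
    longest = max((len(a or b) for a, b in literals), default=0)
    # Count characters that are neither alphanumeric nor whitespace.
    symbols = sum(1 for c in content if not c.isalnum() and not c.isspace())
    return {
        "entropy": char_entropy(content),
        "symbol_ratio": symbols / len(content) if content else 0.0,
        "longest_literal": longest,
    }


plain = obfuscation_features("<p>Hello world</p>")
packed = obfuscation_features('eval(unescape("%64%6f%63%75%6d%65%6e%74"))')
print(plain)
print(packed)
```

A single feature like these is not a verdict on its own; the point is that such measurements stay informative even when the literal bytes of the payload change from one sample to the next.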


MALICIOUS OBFUSCATED JAVASCRIPT 





JavaScript is increasingly used to deliver malicious content, including attacks that use JavaScript alone to accomplish
malicious actions.2 These scripts are obfuscated in order to evade detection by signatures or regular expressions.


Statistical models can identify malicious obfuscated JavaScript and close this gap. This is similar to the detection of
obfuscated HTML, but focuses on malicious obfuscated JavaScript. This can apply to importation of JavaScript files
and not just to <script> elements in HTML documents.


MALICIOUS FLASH OBJECTS


One of the largest gaps in security effectiveness in the NGIPS is the lack of coverage in identifying malicious Flash
objects. Malicious Flash objects and PDF files are widely used in attacks.3 Many of the vulnerabilities disclosed to the
Zero Day Initiative (ZDI) bug bounty program are within Adobe products. External research indicates that static
analysis can detect malicious Flash files. Where static analysis is successful, statistical models can be applied.


MALICIOUS PDF FILES


Another large gap in security effectiveness in the NGIPS is the lack of coverage in identifying malicious PDF files.
External research indicates that static analysis can detect malicious PDFs. Because malicious PDFs often incorporate
ma[...]uscated JavaScript and 2.3 Malicious Flash above is prerequisite to this effort.


MALICIOUS PORTABLE EXECUTABLE (PE) FILES


Polymorphic malware results in over one million new malware samples per day.4 Using PE headers alone, statistical
models can predict whether an executable is malicious with greater than 95% accuracy.5 Internal research from
Trend Micro's TippingPoint DVLabs team has verified that creating statistical models that identify malicious PE files
is straightforward and effective.
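
To make "using PE headers alone" concrete, the following sketch reads a handful of fields from the COFF file header of a PE image and returns them as features. The field layout follows the published PE format, but the choice of fields and the function name `pe_header_features` are assumptions for illustration, not the feature set used by DVLabs.

```python
import struct

# Characteristics flags defined by the PE/COFF format.
IMAGE_FILE_EXECUTABLE_IMAGE = 0x0002
IMAGE_FILE_DLL = 0x2000


def pe_header_features(data: bytes) -> dict:
    """Extract a few COFF header fields from the raw bytes of a PE file."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("missing MZ (DOS) header")
    # e_lfanew at offset 0x3C holds the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    if len(data) < pe_offset + 24:
        raise ValueError("truncated COFF header")
    # The 20-byte COFF file header immediately follows the 4-byte signature.
    (machine, n_sections, timestamp, _symtab, _n_symbols,
     optional_size, characteristics) = struct.unpack_from(
        "<HHIIIHH", data, pe_offset + 4)
    return {
        "machine": machine,
        "number_of_sections": n_sections,
        "timestamp": timestamp,
        "optional_header_size": optional_size,
        "is_executable": bool(characteristics & IMAGE_FILE_EXECUTABLE_IMAGE),
        "is_dll": bool(characteristics & IMAGE_FILE_DLL),
    }


# Synthetic header for demonstration only: a DOS stub pointing at offset 0x40.
sample = (b"MZ" + bytes(0x3A) + struct.pack("<I", 0x40) + b"PE\x00\x00"
          + struct.pack("<HHIIIHH", 0x8664, 3, 0, 0, 0, 240, 0x2022))
features = pe_header_features(sample)
print(features)
```

Because only a fixed number of bytes near the start of the file is read, features of this kind can be computed before the rest of the file has arrived.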














CUSTOM PACKED FILES 


Over 75% of malware executables are packed.6 Regular expression filters can block files packed with off-the-shelf
packing utilities, but malware authors are increasingly using custom polymorphic packers to evade detection.
Custom packed files can be detected by measuring the compressibility of the files (ibid). While compressibility will
not be used directly because of the latency it would introduce, related complexity measures such as entropy may be
used. This approach is simpler and more direct than using a model based on PE imports because it relies on very few
features, and possibly just one.
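
The two measures mentioned above can be sketched in a few lines. The example computes Shannon entropy over byte values and, for comparison, a zlib compression ratio; the function names are illustrative, and zlib stands in for whatever compressor one might choose.

```python
import math
import os
import zlib
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte, ranging from 0.0 to 8.0."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(data).values())


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size; near 1.0 means incompressible."""
    if not data:
        return 1.0
    return len(zlib.compress(data)) / len(data)


repetitive = b"A" * 4096        # stands in for low-complexity content
random_like = os.urandom(4096)  # stands in for packed or encrypted content

print(byte_entropy(repetitive), compression_ratio(repetitive))
print(byte_entropy(random_like), compression_ratio(random_like))
```

Entropy needs only a single pass and a 256-entry histogram, which is why it is a plausible low-latency substitute for actually running a compressor over the data.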





USING MACHINE LEARNING TO ADDRESS UNKNOWN THREATS WITH NGIPS





Machine learning techniques are being adopted quickly across many industries; however, Trend Micro is the first to
leverage this capability to detect and eliminate some of the threats mentioned above in-line at wire speed through
the TippingPoint Next-Generation Intrusion Prevention System (NGIPS) and Threat Protection System (TPS).7 Our
revolutionary approach, powered by XGen™ security, provides an additional measure of security on top of traditional
signature-based approaches to intrusion prevention.


The following illustration shows a highly simplified representation of machine learning capabilities in the
Trend Micro TippingPoint solutions.

Illustration: Basics of machine learning and application to the TippingPoint NGIPS and TPS


4 http://money.cnn.com/2015/04/14/technology/security/cyber-attack-hacks-security/ 
Digital Vaccine® (DV) filter packages used by the TippingPoint NGIPS and
TPS are a strong mechanism to detect network-based malicious activity,
exploitation of vulnerabilities, and unwanted application use. However, as
the TippingPoint solutions have blocked these critical attacks more
effectively, exploit kit authors have adjusted their tactics to evade
traditional signature-based techniques such as pattern-matching regular
expressions. They now obfuscate content using packing/compression, script
obfuscation, encryption, and much more. This makes detection by classic
mechanisms extremely difficult, often requiring multiple signatures and, in
many cases, detecting only a subset of the malicious content.


This is where machine learning and statistical data modeling become so
effective. At a high level, machine learning works by training a machine:
“feature vectors” are extracted from a dataset of benign and malicious
examples in order to compute a mathematical model. This model is
evaluated against network traffic and, in the case of the TippingPoint
solutions, can make a real-time decision about whether the content appears
to be benign or malicious. If the content is determined to be malicious, the
TippingPoint solutions block the content from entering the network. DV
filters developed using the mathematical models operate without affecting
network performance and without introducing a large number of false
positives.
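
The train-then-evaluate workflow described above can be sketched with a small logistic-regression model written in plain Python. Everything concrete here is an assumption for illustration: the two-element feature vectors are made-up toy data, and `train` and `classify` are hypothetical helpers rather than any TippingPoint interface.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, labels, epochs=500, lr=0.5):
    """Fit logistic-regression weights with stochastic gradient descent."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            p = sigmoid(bias + sum(w * v for w, v in zip(weights, x)))
            err = p - y  # gradient of the log loss with respect to the score
            weights = [w - lr * err * v for w, v in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def classify(model, x, threshold=0.5) -> str:
    """Evaluate the trained model against one feature vector."""
    weights, bias = model
    score = sigmoid(bias + sum(w * v for w, v in zip(weights, x)))
    return "malicious" if score >= threshold else "benign"


# Toy, made-up feature vectors (for example: normalized entropy, symbol ratio).
BENIGN = [[0.2, 0.1], [0.3, 0.05], [0.25, 0.15]]
MALICIOUS = [[0.9, 0.6], [0.8, 0.7], [0.95, 0.5]]

model = train(BENIGN + MALICIOUS, [0, 0, 0, 1, 1, 1])
print(classify(model, [0.1, 0.1]))
print(classify(model, [0.9, 0.7]))
```

Training is the expensive step and happens offline; evaluating the finished model is a dot product and a threshold, which is what makes an in-line, real-time decision feasible.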


Trend Micro TippingPoint also uses machine learning to detect Domain
Generation Algorithms (DGAs), which many malware families (e.g.
Conficker) use to randomly generate domain names in order to contact their
command and control (CnC) servers. TippingPoint Threat DV DGA filters
include classifiers, developed using machine learning techniques across
a significant DNS dataset, that can detect families of DGAs using a
combination of syntactical rules and logistic regression with over 95%
accuracy. DGA filters are also in place to catch many types of malware whose
domain names cannot be encompassed by a regular expression without
generating a large number of false positives.
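
A combination of a syntactical rule and a logistic-regression score might look like the sketch below. The features, the minimum-length rule, and especially the weights are hand-picked assumptions for illustration; they are not trained on real DNS data and do not reflect the actual DGA filters.

```python
import math
from collections import Counter

VOWELS = set("aeiou")

# Hypothetical, hand-picked coefficients; a real classifier would learn these.
WEIGHTS = [2.0, 1.5, 3.0, -6.0]
BIAS = -3.5


def domain_features(label: str) -> list:
    """Features of one DNS label: length, entropy, digit and vowel ratios."""
    n = len(label)
    counts = Counter(label)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    digit_ratio = sum(ch.isdigit() for ch in label) / n
    vowel_ratio = sum(ch in VOWELS for ch in label) / n
    return [n / 63, entropy, digit_ratio, vowel_ratio]  # 63 = max label length


def is_dga_like(domain: str, threshold: float = 0.5) -> bool:
    labels = domain.lower().rstrip(".").split(".")
    # Simplification: take the label before the TLD (wrong for e.g. "co.uk").
    label = labels[-2] if len(labels) >= 2 else labels[0]
    # Syntactical rule: very short labels carry too little signal to score.
    if len(label) < 6:
        return False
    z = BIAS + sum(w * f for w, f in zip(WEIGHTS, domain_features(label)))
    return 1.0 / (1.0 + math.exp(-z)) >= threshold


for name in ("google.com", "wikipedia.org", "xq7z3kf9wvbt2m.net"):
    print(name, is_dga_like(name))
```

The rule runs first so that the scored model only sees labels where its features are meaningful, which is one way such a combination can help keep false positives down.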
Powered by XGen™ security